Stumpy is a Python library designed for efficient analysis of large time series data. It uses matrix profile computation to identify patterns, anomalies, and shapelets. Stumpy leverages optimized algorithms, parallel processing, and early termination to significantly reduce computational overhead.
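To make the matrix profile idea in the Stumpy entry concrete, here is a minimal brute-force sketch in plain Python. It is not Stumpy's algorithm or API: the function names, the toy series, and the exclusion-zone width of m // 4 are assumptions chosen for illustration, and the quadratic loop is exactly the cost that optimized libraries are built to avoid. For each window of length m, it records the z-normalized Euclidean distance to the nearest other window; a small value suggests a repeated pattern, a large value suggests an anomaly.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def naive_matrix_profile(series, m):
    """Brute-force matrix profile: O(n^2 * m), for illustration only."""
    n_sub = len(series) - m + 1
    if m < 2 or n_sub < 2:
        raise ValueError("need m >= 2 and at least two windows")
    subs = [znorm(series[i:i + m]) for i in range(n_sub)]
    excl = max(1, m // 4)  # assumed exclusion zone to skip trivial self-matches
    profile = [math.inf] * n_sub
    index = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) <= excl:
                continue
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            if d < profile[i]:
                profile[i] = d
                index[i] = j
    return profile, index


# Toy data (hypothetical): a period-4 pattern with one injected spike.
series = [0, 1, 2, 1] * 6
series[13] = 10

profile, index = naive_matrix_profile(series, m=4)

# The window with the largest profile value is the strongest anomaly candidate.
discord = max(range(len(profile)), key=profile.__getitem__)
print("discord starts at", discord)
```

Windows that never touch the spike have an exact repeat elsewhere in the series, so their profile value is zero; only the windows overlapping position 13 stand out.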
Outlier treatment is a necessary step in data analysis. This article, part 3 of a four-part series, aims to make that step easier and offers insight into effective methods and tools for outlier detection.
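The summary does not say which detection methods the article covers, so the following is only one common option, sketched as an assumption: Tukey's interquartile range (IQR) rule, using Python's standard `statistics` module. The sample readings, the 1.5 multiplier, and the choice to clip rather than drop flagged values are all illustrative.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Hypothetical sensor readings with one obvious outlier.
readings = [10, 12, 11, 13, 12, 11, 10, 95, 12, 11]

lower, upper = iqr_fences(readings)

# Detection: flag anything outside the fences.
outliers = [x for x in readings if x < lower or x > upper]

# Treatment (one option among several): clip flagged values to the fences.
treated = [min(max(x, lower), upper) for x in readings]

print("fences:", lower, upper)
print("outliers:", outliers)
```

Clipping keeps the series length intact, which matters for time series; dropping or imputing the flagged points are alternatives with different trade-offs.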
The use cases covered in the article include caching, queueing, locking, throttling, session store, and rate limiting.
This work examines the relationship between predictability and reconstructability, and how the two can vary in opposite directions in complex systems. It is grounded in information theory, was carried out on various dynamics on random graphs, including continuous deterministic systems, and provides analytical calculations of the uncertainty coefficients for many different systems.
The article discusses the challenges of evaluating anomaly detection in time series data and introduces Proximity-Aware Time series anomaly Evaluation (PATE) as a solution. PATE provides a weighted version of the Precision-Recall curve and considers temporal correlations and buffer zones for a more accurate and nuanced evaluation.
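The buffer-zone idea in the PATE entry can be illustrated with a deliberately simplified sketch. This is not the PATE definition: the linear decay, the function names, and the way credit is split between true and false positives are assumptions made only to show why proximity-aware scoring differs from point-wise scoring. A prediction that lands just outside a labeled anomaly earns partial credit instead of counting as a full false positive.

```python
def proximity_weights(labels, buffer):
    """Weight 1.0 on labeled anomaly points, decaying linearly to 0
    across `buffer` points on either side (assumed decay shape)."""
    anomalies = [i for i, y in enumerate(labels) if y]
    weights = []
    for t in range(len(labels)):
        if not anomalies:
            weights.append(0.0)
            continue
        dist = min(abs(t - a) for a in anomalies)
        weights.append(max(0.0, 1.0 - dist / (buffer + 1)))
    return weights


def weighted_precision_recall(labels, preds, buffer):
    """Simplified proximity-weighted precision and point-wise recall."""
    w = proximity_weights(labels, buffer)
    flagged = [t for t, p in enumerate(preds) if p]
    tp = sum(w[t] for t in flagged)        # partial credit near anomalies
    fp = sum(1.0 - w[t] for t in flagged)  # remainder counts against precision
    hits = sum(1 for t in flagged if labels[t])
    total = sum(1 for y in labels if y)
    precision = tp / (tp + fp) if flagged else 0.0
    recall = hits / total if total else 0.0
    return precision, recall


# Hypothetical labels: one anomaly covering positions 4 and 5.
labels = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
near = [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]  # one hit, one miss right next to the event
far = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]   # one hit, one miss far from the event

near_p, near_r = weighted_precision_recall(labels, near, buffer=2)
far_p, far_r = weighted_precision_recall(labels, far, buffer=2)
print("near:", near_p, near_r)
print("far:", far_p, far_r)
```

Point-wise precision would score both detectors at 0.5; with the assumed buffer weighting, the detector whose miss falls next to the event scores higher than the one whose miss is far away.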
pg_timeseries is an open-source PostgreSQL extension focused on creating a cohesive user experience around the creation, maintenance, and use of time-series tables. It allows users to create time-series tables, configure the compression and retention of older data, monitor time-series partitions, and run complex time-series analytics functions with a user-friendly syntax.
This article describes how to use GNU Emacs for quick data visualization in combination with Gnuplot. It provides a command that can be used to visualize the correlation of data without needing any setup or specific files. The article also includes an example of a command for generating a graph using a data range selected with a rectangle command copy-rectangle.
The race to bring data into development is here. We see it in many ways, most noticeably in the increasing relevance of time series databases. InfluxData, the company behind InfluxDB, is one of the leading open source time series database technology providers. The company is now partnering with Amazon Web Services to offer a managed open source time series database service.
This article covers PySpark for time-series data, discussing data ingestion, extraction, and visualization with practical implementation code.
This article provides a comprehensive guide to performing exploratory data analysis on time series data, with a focus on feature engineering.
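The summary does not list which features the guide builds, so the sketch below assumes two of the most common ones for time series: lag features and a trailing rolling mean. The function name, column names, and sample values are hypothetical. The rolling window uses only observations strictly before the current step, so the target value does not leak into its own features.

```python
def make_features(series, lags=(1, 2), window=3):
    """Build one feature row per time step that has enough history."""
    rows = []
    start = max(max(lags), window)
    for t in range(start, len(series)):
        row = {"y": series[t]}
        for k in lags:
            row[f"lag_{k}"] = series[t - k]
        past = series[t - window:t]  # excludes series[t] to avoid leakage
        row[f"rolling_mean_{window}"] = sum(past) / window
        rows.append(row)
    return rows


# Hypothetical daily values.
series = [10, 12, 14, 13, 15, 18]

rows = make_features(series)
for row in rows:
    print(row)
```

The first few time steps are dropped because they lack enough history; padding or shorter windows are alternatives when the series is short.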